Om: One Tool for Many (Indian) Languages
Abstract
Many different languages are spoken in India, each the mother tongue of tens of millions of people. While the languages and scripts are distinct from each other, their grammars and alphabets are similar to a large extent. One common feature is that all the Indian languages are phonetic in nature. In this paper we describe the development of Om, a transliteration scheme that exploits this phonetic nature of the alphabet. Om uses ASCII characters to represent Indian language alphabets, so text in Om can be read directly in English by the large number of users who cannot read the scripts of Indian languages other than their mother tongue. It is also useful in computer applications, such as email and chat, for which local language tools are not yet available. Another significant contribution of this paper is a text editor for Indian languages that integrates Om input for many Indian languages into a word processor such as Microsoft WinWord. The text editor has also been developed on the Java platform, so it can run on Unix machines as well. We propose this transliteration scheme as a possible standard for Indian language transliteration and keyboard entry.
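As a rough illustration of how an ASCII-based phonetic scheme of this kind can drive script output, the following Python sketch converts Roman input to Devanagari with a greedy longest-match lookup. The mapping table, the function name `transliterate`, the choice of Devanagari, and the matching strategy are all assumptions made for this example; the abstract does not give the actual Om table or algorithm.

```python
# Hypothetical mini-table for illustration only; this is NOT the Om mapping.
CONSONANTS = {
    "k": "क", "kh": "ख", "g": "ग", "t": "त", "n": "न",
    "bh": "भ", "m": "म", "r": "र", "l": "ल", "s": "स", "h": "ह",
}

# Each vowel maps to (independent letter, dependent sign used after a consonant).
# The dependent signs are combining characters, written as escapes for legibility.
VOWELS = {
    "a": ("अ", ""),          # inherent vowel: no sign after a consonant
    "aa": ("आ", "\u093e"),
    "i": ("इ", "\u093f"),
    "ii": ("ई", "\u0940"),
    "u": ("उ", "\u0941"),
    "uu": ("ऊ", "\u0942"),
    "e": ("ए", "\u0947"),
    "o": ("ओ", "\u094b"),
}

VIRAMA = "\u094d"  # suppresses the inherent vowel of the preceding consonant


def transliterate(text: str) -> str:
    """Convert ASCII input to Devanagari using greedy longest-match lookup."""
    out = []
    pending = False  # True while the last emitted consonant awaits a vowel
    i = 0
    while i < len(text):
        for length in (2, 1):  # try two-character keys before one-character keys
            chunk = text[i:i + length]
            if len(chunk) < length:
                continue
            if chunk in CONSONANTS:
                if pending:
                    out.append(VIRAMA)  # consonant cluster: join with virama
                out.append(CONSONANTS[chunk])
                pending = True
                break
            if chunk in VOWELS:
                independent, sign = VOWELS[chunk]
                out.append(sign if pending else independent)
                pending = False
                break
        else:
            # Unmapped character (space, digit, punctuation): pass it through.
            if pending:
                out.append(VIRAMA)
            out.append(text[i])
            pending = False
            length = 1
        i += length
    if pending:
        out.append(VIRAMA)  # trailing consonant with no vowel typed
    return "".join(out)


print(transliterate("namaste"))
print(transliterate("bhaarata"))
```

Because the table is keyed on sounds rather than on one script's glyphs, the same ASCII input could in principle be rendered in another Indian script by swapping in a different table, which is the property the abstract attributes to the phonetic nature of these alphabets.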
Similar resources
OM: “One Tool for Many (Indian) Languages”
A large number of different languages are spoken in India, each language being the mother tongue of tens of millions of people. While the languages and scripts are distinct from each other, the grammar and the alphabet are similar to a large extent. One common feature is that all the Indian languages are phonetic in nature. In this paper we describe the development of a transliteration scheme O...
Parts Of Speech Tagging for Indian Languages: A Literature Survey
Part of speech (POS) tagging is the process of assigning a part-of-speech tag or other lexical class marker to each word in a sentence. In many Natural Language Processing applications such as word sense disambiguation, information retrieval, information processing, parsing, question answering, and machine translation, POS tagging is considered one of the basic necessary tool...
On minimal realization of IF-languages: A categorical approach
The purpose of this work is to introduce and study the concept of a minimal deterministic automaton with IF-outputs which realizes a given IF-language. Of the two methods presented here for constructing such an automaton, one is based on Myhill-Nerode's theory while the other is based on derivatives of the given IF-language. Meanwhile, the categories of deterministic automata with IF-outputs and ...
On the Status of Object Markers in Bantu languages
Much of this literature has concentrated on choosing between two analyses: either the OM is a pure agreement marker, analogous to subject markers/agreement in Bantu and other languages, or it is a cliticized pronoun that counts as the true object of the verb even though it appears on the verb on the surface. On the latter analysis, the overt NP in (1c) is not the true grammatical object o...
Grammar Checkers for Natural Languages: a Review
Natural language processing is an interdisciplinary branch of linguistics and computer science, studied under Artificial Intelligence (AI), that gave birth to an allied area called ‘Computational Linguistics’, which focuses on the processing of natural languages on computational devices. A natural language consists of many sentences which are meaningful linguistic units involving one or more words ...